Comparison of Feature Selection Methods in Breast Cancer Microarray Data

Abstract

Aim: We aim to predict metastasis in breast cancer patients with conventional tree-based machine learning algorithms and to observe which feature selection method is more effective at reducing the number of features in the related microarray data. Materials and Methods: Feature selection methods, namely least absolute shrinkage and selection operator (LASSO), Boruta, and maximum relevance-minimum redundancy (MRMR), together with statistical preprocessing steps, were applied first, before tree-based algorithms such as Decision Tree, Extremely Randomized Trees (Extra-Trees), and Gradient Boosting Tree were run on the data. Results: The microarray dataset of 54,675 features from 202 patients (101 with and 101 without metastases) was first reduced to 235 features, and the most important features were then found with the algorithms. The highest recall and F-measure values were obtained with the XGBoost method, and the highest precision value with the Extra-Trees method. The 10 arrays with the highest variable importance are listed. Conclusion: The most accurate results were obtained on the preprocessed data with the Extra-Trees method. Statistical preprocessing would be enough for the analysis of metastasis predictions.
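The recall, precision, and F-measure values compared in the abstract follow the standard definitions for a binary classifier. The block below is a minimal sketch of those formulas, assuming metastasis is treated as the positive class; the function name `precision_recall_f1` is hypothetical and the labels are made-up toy values, not data from the study.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F-measure) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# Toy labels (assumption): 1 = metastasis, 0 = no metastasis.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
precision, recall, f_measure = precision_recall_f1(y_true, y_pred)
print(precision, recall, f_measure)
```

Recall penalizes missed metastasis cases, while precision penalizes false alarms, so one classifier can lead on recall and F-measure while another leads on precision.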

Similar resources

Diagnosis of Breast Cancer Subtypes using the Selection of Effective Genes from Microarray Data

Introduction: Early diagnosis of breast cancer and the identification of effective genes are important issues in the treatment and survival of the patients. Gene expression data obtained using DNA microarray in combination with machine learning algorithms can provide new and intelligent methods for diagnosis of breast cancer. Methods: Data on the expression of 9216 genes from 84 patients across...

Comparison of feature extraction methods with microarray gene-expression data

With microarray gene-expression data, we compare supervised feature extraction methods with the unsupervised feature extraction methods. From experimental results, it is shown that the supervised feature extraction methods are more powerful than the unsupervised feature extraction methods in terms of class separability.

Feature Selection and Classification of MAQC-II Breast Cancer and Multiple Myeloma Microarray Gene Expression Data

Microarray data has a high dimension of variables but available datasets usually have only a small number of samples, thereby making the study of such datasets interesting and challenging. In the task of analyzing microarray data for the purpose of, e.g., predicting gene-disease association, feature selection is very important because it provides a way to handle the high dimensionality by explo...

Feature Selection by Weighted-SNR for Cancer Microarray Data Classification

Feature selection is widely used to improve high-dimensional data analysis, especially in classification tasks. Cancer microarray data classification belongs to this category. Many studies have examined feature selection for microarray data classification. The major problem is that many feature selection methods require the number of features to be defined in advance. Unfortunatel...

Feature Selection for Cancer Classification Using Microarray Gene Expression Data

DNA microarray technology enables us to measure the expression levels of thousands of genes simultaneously, providing a great opportunity for cancer diagnosis and prognosis. The number of genes often exceeds tens of thousands, whereas the number of subjects available is often no more than a hundred. Therefore, it is necessary and important to perform gene selection for classification purposes. A go...

Journal

Journal title: Medical records-international medical journal

Year: 2023

ISSN: 2687-4555

DOI: https://doi.org/10.37990/medr.1202671